Speaker adaptation based on system combination using speaker-class models
Abstract
In this paper, we propose a new system combination approach for a large-vocabulary continuous speech recognition (LVCSR) system using speaker-class (SC) models, together with a speaker adaptation technique based on these SC models. The basic concept of the SC-based system is to train acoustic models on speakers who are acoustically similar to a target speaker. A major problem with SC models is determining the selection range, that is, how many speakers should be selected. To address this problem, several SC models, each trained on a different number of speakers, are prepared in advance. In the recognition step, acoustically similar models are selected from these SC models, and the scores obtained from the selected models are merged using a word graph combination technique. The proposed method was evaluated on the Corpus of Spontaneous Japanese (CSJ) and showed significant improvement in a lecture speech recognition task.
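The two ideas in the abstract, building speaker classes of several sizes and merging the scores of several models, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the similarity values, and the word posteriors are hypothetical, and the weighted posterior average used here is a much simpler stand-in for the word graph combination technique the abstract mentions.

```python
from collections import defaultdict


def select_cohorts(similarity, sizes):
    """Build one speaker class per requested size.

    `similarity` maps a training-speaker id to a score where higher means
    acoustically closer to the target speaker (an assumed convention).
    """
    ranked = sorted(similarity, key=similarity.get, reverse=True)
    return {n: ranked[:n] for n in sizes}


def combine_scores(per_model_posteriors, weights=None):
    """Merge word posteriors from several models by weighted averaging.

    A hypothesis missing from a model contributes 0 for that model.
    This is a simplification, not a word graph combination.
    """
    if weights is None:
        weights = [1.0 / len(per_model_posteriors)] * len(per_model_posteriors)
    combined = defaultdict(float)
    for weight, posteriors in zip(weights, per_model_posteriors):
        for hypothesis, p in posteriors.items():
            combined[hypothesis] += weight * p
    return dict(combined)


# Hypothetical similarity scores (e.g. log-likelihoods of the target's speech).
similarity = {"spk_a": -41.2, "spk_b": -39.5, "spk_c": -47.0, "spk_d": -40.1}
cohorts = select_cohorts(similarity, sizes=[2, 3])

# Hypothetical word posteriors from two SC models for one word slot.
combined = combine_scores([
    {"speech": 0.6, "peach": 0.4},
    {"speech": 0.5, "beach": 0.5},
])
best_word = max(combined, key=combined.get)
```

Preparing several cohort sizes in advance sidesteps the need to pick a single "right" number of speakers, which is the selection-range problem the abstract describes.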
Similar resources
Speaker Adaptation in Continuous Speech Recognition Using MLLR-Based MAP Estimation
A variety of methods are used for speaker adaptation in speech recognition. In some techniques, such as MAP estimation, only the models with available training data are updated. Hence, large amounts of training data are required in order to have significant recognition improvements. In some others, such as MLLR, where several general transformations are applied to model clusters, the results ar...
Transformation and combination of hidden Markov models for speaker selection training
This paper presents a 3-stage adaptation framework based on speaker selection training. First, a subset of cohort speakers is selected for the test speaker using a Gaussian mixture model, which is more reliable given very limited adaptation data. Then the cohort models are linearly transformed closer to each test speaker. Finally, the adapted model for the test speaker is obtained by combining these transf...
Fast Speaker Adaptation of Large Vocabulary Continuous Density Speech Recognizer Using a Density Combination Approach
Maximum likelihood transformation adaptation techniques have proven successful, but it is believed that faster convergence to speaker-dependent (SD) performance can be achieved if we incorporate some form of a priori knowledge in the adaptation process. In this paper, instead of estimating one linear transform per class of models for each new speaker, we transform the speaker-independent (SI) mo...
An online incremental speaker adaptation method using speaker-clustered initial models
We previously proposed an incremental speaker adaptation method combined with automatic speaker-change detection for broadcast news transcription where speakers change frequently and each of them utters a series of several sentences. In this method, the speaker change is detected using speaker-independent and speaker-adaptive Gaussian mixture models (GMMs). Both phone HMMs and GMMs are incremen...